Corpus: fra_news_2007_100K

4.4.1.5 Number of Word-N-grams at Sentence Endings

Number of word-N-grams for N=1...5 for the first K sentences

K        # of words  # of bigrams  # of trigrams  # of 4-grams  # of 5-grams
100              94            97             97            99            99
1000            880           976            993           998           998
10000          6515          8921           9766          9943          9972
100000        34667         68692          89991         97215         98976
1000000       34667         68693          89992         97216         98977
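
The counts above can be approximated with a short script. The following is a minimal sketch, not the tool that produced this page: it assumes the table reports distinct N-grams formed by the last N words of each sentence, that tokens are separated by whitespace, and that sentences with fewer than N tokens are skipped. The function name `count_ending_ngrams` is hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def count_ending_ngrams(sentences: Iterable[str], max_n: int = 5) -> Dict[int, int]:
    """Count distinct word N-grams (N = 1..max_n) found at sentence endings.

    Assumptions (not confirmed by the source page): whitespace tokenization,
    and sentences shorter than N tokens contribute nothing for that N.
    """
    seen: Dict[int, Set[Tuple[str, ...]]] = {n: set() for n in range(1, max_n + 1)}
    for sentence in sentences:
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            if len(tokens) >= n:
                # The last n tokens form the sentence-ending N-gram.
                seen[n].add(tuple(tokens[-n:]))
    return {n: len(ngrams) for n, ngrams in seen.items()}


if __name__ == "__main__":
    demo = ["the cat sat down", "a dog sat down", "hello world"]
    print(count_ending_ngrams(demo))
```

To reproduce one row for a given K, the function would be called on the first K sentences of the corpus, for example `count_ending_ngrams(sentences[:1000])`.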


Zipf's diagram for sentence endings


[Figure: Gnuplot diagram]
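
A Zipf diagram plots frequency against rank, usually on log-log axes. As a rough sketch of how the underlying data points could be derived, the snippet below ranks sentence-ending N-grams by frequency. It assumes whitespace tokenization and skips sentences shorter than N tokens; the function name `zipf_points` is hypothetical and the page's actual procedure is not documented here.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def zipf_points(sentences: Iterable[str], n: int = 1) -> List[Tuple[int, int]]:
    """Return (rank, frequency) pairs for sentence-ending N-grams.

    Rank 1 is the most frequent ending. Assumes whitespace tokenization.
    """
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if len(tokens) >= n:
            counts[tuple(tokens[-n:])] += 1
    # Sort frequencies in descending order and attach 1-based ranks.
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))


if __name__ == "__main__":
    demo = ["the cat sat down", "a dog sat down", "hello world"]
    for rank, freq in zipf_points(demo):
        print(rank, freq)
```

The resulting pairs can be written to a two-column text file and plotted with logarithmic scales on both axes.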

5977 msec needed at 2018-03-02 21:04